Journal of Vision
● Association for Research in Vision and Ophthalmology (ARVO)
Preprints posted in the last 90 days, ranked by how well they match Journal of Vision's content profile, based on 92 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.
Li, L.; Landy, M. S.
Sensory representations are inherently noisy, and monitoring this noise is essential for effective decision-making. This metacognitive ability to evaluate the quality of one's perceptual decisions is referred to as perceptual confidence. However, whether perceptual confidence accurately tracks internal noise remains unresolved. Peripheral vision provides a natural testing ground for this question, yet previous studies report mixed results, complicated by differing definitions and measurements of confidence. Here, we used a normative Bayesian framework with incentivized confidence measurements to address these discrepancies. We tested the Bayesian-confidence hypothesis that confidence is derived from the posterior probability distribution of the feature being judged, given noisy sensory measurements. We used two perceptual tasks while varying stimulus eccentricity: spatial localization and orientation estimation. We measured confidence by post-decision wagering, in which participants set a symmetrical range around their perceptual estimates. Participants earned higher reward for narrower confidence ranges but received zero reward if the range did not enclose the target. We estimated sensory noise from the perceptual responses to predict confidence, assuming that sensory noise increases linearly with eccentricity. We then compared a normative Bayesian model with three alternative models, each challenging a different assumption. Across both tasks, the Bayesian ideal-observer model best predicted confidence. These results suggest that humans can accurately monitor the increased internal noise in peripheral vision and use this information to make optimal confidence judgments.
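To make the wagering rule concrete, here is a minimal sketch of the ideal observer's range setting: under a Gaussian posterior with standard deviation sigma, the optimal symmetric half-width trades the points paid for a narrow range against the probability that the range encloses the target. The payoff function below (points decaying with width) is a hypothetical stand-in for the study's actual reward schedule.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import erf

def expected_reward(half_width, sigma, max_points=100.0):
    """Expected reward for a symmetric range of +/- half_width around the
    estimate, given a Gaussian posterior with sd `sigma`. The payoff rule
    (points shrinking with width) is hypothetical, not the paper's."""
    p_enclose = erf(half_width / (sigma * np.sqrt(2)))  # P(target inside range)
    points = max_points / (1.0 + half_width)            # narrower range -> more points
    return points * p_enclose

def optimal_half_width(sigma):
    res = minimize_scalar(lambda r: -expected_reward(r, sigma),
                          bounds=(1e-3, 50.0), method="bounded")
    return res.x

for sigma in (1.0, 2.0, 4.0):  # sensory noise growing with eccentricity
    print(f"sigma={sigma}: optimal half-width = {optimal_half_width(sigma):.2f}")
```

A Bayesian observer widens the wager as eccentricity (and hence sigma) grows, which is the signature behavior the model comparison tests for.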
Morimoto, T.; Lee, R. J.; Smithson, H. E.
Color constancy allows us to perceive stable object colors under different lighting conditions by reducing the impact of the lighting. Information about illuminant color could be derived from a white surface or a specular highlight. The "brightest is white" heuristic has frequently been incorporated in illumination-estimation models to identify the illuminant color. Here, we tested an alternative hypothesis: we use structured changes in the proximal image to identify highlight regions, even when they are not the brightest elements in the scene. In computer-rendered scenes, we varied the reliability of "brightest element" and "highlight geometry" cues and tested their effect on a color constancy task. Each scene had a single spherical surface lit by several point lights with identical spectral properties. The surface had a uniform spectral reflectance but a noise texture that attenuated the reflectance by a variable scale factor. We tested three levels of specularity: zero (matte), low, and mid. Observers watched a 1.5-second animation and reported whether color changes were due to illuminant or material changes. Discrimination performance for matte surfaces was nearly at chance level, as predicted. However, as specularity increased, performance improved significantly. Observers outperformed an ideal observer model that relied solely on the brightest element. Notably, when the specular region appeared on a dark part of the texture, observer performance improved even more--even though the brightest-element heuristic would predict a decrease. When specular geometries were difficult to identify due to phase scrambling, observer performance dropped significantly. These results suggest that we do not simply rely on the brightest element, but rather exploit regularities of the diffuse and specular components of the proximal image to resolve surface and illuminant ambiguities.
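For reference, the brightest-element heuristic that the ideal-observer benchmark relies on reduces to a few lines: take the chromaticity of the most luminous image region as the illuminant estimate. A minimal sketch assuming Rec. 709 luminance weights; it is not the paper's rendered-scene ideal observer.

```python
import numpy as np

def brightest_element_illuminant(image_rgb):
    """'Brightest is white': return the normalised chromaticity of the
    single brightest pixel as the illuminant-colour estimate."""
    luminance = image_rgb @ np.array([0.2126, 0.7152, 0.0722])  # Rec. 709
    idx = np.unravel_index(np.argmax(luminance), luminance.shape)
    brightest = image_rgb[idx].astype(float)
    return brightest / brightest.sum()
```

When the specular highlight falls on a dark texture region, this estimator locks onto a bright diffuse patch instead, which is why it predicts a performance drop that observers did not show.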
Eicke-Kanani, L.; Tatai, F.; Rosenberger, L.; Schmitter, C.; Straube, B.; Wallis, T. S.
Michotte's "launching displays" are animations of collision-like interactions between two objects that elicit a stable and robust impression that one object, the launcher, caused another object, the target, to move. Although it is well known that unexpected disruptions of movement continuation between launcher and target decrease causal impressions in centre-to-centre collisions, the role of observers' visual uncertainty about predicted motion trajectories remains relatively unexplored. In this work, we (1) assess observers' uncertainty about post-collision motion angles in a trajectory prediction task and (2) collect their causal impressions in a causality rating task. In the latter task, observers viewed centre-to-centre collisions with different levels of movement continuity between the launcher and the target disc. By presenting different launch orientations, we exploited the well-known oblique effect to vary trajectory-prediction uncertainty within individuals. If observers rely on their trajectory predictions to rate the causality of the collision, we expect their accuracy in (1) to have a systematic influence on their causality ratings in (2). We replicate previous findings that observers report stronger causal impressions in trials where the target and the launcher move in the same direction, and weaker causal impressions for collisions where the target's trajectory deviated from the launcher's. Furthermore, causality ratings were on average higher for oblique compared to cardinal launch directions, implying that increased sensory uncertainty induces a stronger causal impression. We hope this work will inspire deeper empirical assessments and computational models of the role of sensory uncertainty and predictive processes in shaping subjective impressions of causality.
Li, Y. H.; Mizobuchi, S.; Wang, J. Z.; Rucci, M.
During natural fixation, ocular drifts continually modulate the input to the retina. Previous studies have shown that this motion enhances sensitivity to fine spatial detail, a conclusion supported by findings of reduced sensitivity to high--but not low--spatial frequencies when stimuli are immobilized on the retina for brief periods of time. Most prior retinal-stabilization studies have relied on fast-phosphor cathode ray tube (CRT) displays or adaptive optics scanning laser ophthalmoscopes (AOSLOs), both of which deliver temporally pulsed stimulation. This raises the question of whether stimulus flicker contributed to the previously observed perceptual impairments under retinal stabilization. Here, we replicate stabilization experiments using two types of fast displays that provide more continuous stimulation: liquid-crystal display (LCD) and organic light-emitting diode (OLED) monitors. We again find an impairment in sensitivity to high spatial frequencies under retinal stabilization. Analyses of the retinal input confirm high-quality stabilization within the temporal bandwidth of human vision. These results show that retinal-stabilization effects are robust across display technologies and are little affected by the specific dynamics of modern displays.
Nakamura, A.; Luo, J.; Yokoi, I.; Takemura, H.
Visual perception of symbolic numerals is essential for everyday tasks; however, the neural and perceptual mechanisms underlying this ability remain unclear. Partially occluded digital numerals can elicit bistable perception, and adaptation to symbolic numerals alters the perception of these ambiguous stimuli. We aimed to examine how symbolic numeral adaptation is related to hierarchical visual processing by testing its interocular and interhemifield transfer. Experiment 1 tested interocular transfer by presenting the test stimulus to either the same or opposite eye as the adaptation stimulus. Experiment 2 assessed interhemifield transfer by presenting the test stimulus to either the same or opposite hemifield as the adaptation stimulus. Experiment 3 examined the interhemifield transfer of adaptation confined to the upper parts of digital numerals. Our results showed that adaptation to digital numerals induced shifted perceptual interpretations that transferred across eyes. In addition, we found that adaptation to digital numerals induced a relatively small but statistically significant interhemifield transfer. In contrast, adaptation restricted to the upper parts of digital numerals showed no significant interhemifield transfer. These findings suggest that the perceptual interpretation of symbolic numerals involves visual processing stages that integrate information across the eyes and hemifields.
Tailor-Hamblin, V. K.; Theodorou, M.; Dahlmann-Noor, A.; Dekker, T. M.; Greenwood, J. A.
Purpose: Foveal vision in individuals with albinism is impaired not only by reduced visual acuity but also by elevated crowding - the disruption of object recognition in clutter. Because albinism is characterised by both retinal underdevelopment and nystagmus (uncontrolled eye movements), it is unclear whether crowding is elevated primarily by image motion due to eye movements or by an additional sensory deficit. To disentangle these factors, we examined the spatial and featural selectivity of foveal crowding in albinism, comparing performance with controls and with prior data from individuals with idiopathic infantile nystagmus syndrome (IINS), where nystagmus occurs without retinal underdevelopment. Methods: Adults with albinism (n=8) and age-matched controls (n=8; 19-49 years) identified the orientation of foveal Landolt-C targets. In Experiment 1, targets were presented alone or flanked horizontally or vertically to assess spatial selectivity. In Experiment 2, flankers were of the same or opposite contrast polarity to assess featural selectivity. Stimulus size was adaptively scaled using QUEST to estimate gap-size thresholds. Results: Crowding was substantially elevated in albinism, relative to both controls and IINS. Experiment 1 revealed stronger crowding for horizontally than vertically positioned flankers in albinism, mirroring the predominant direction of nystagmic eye movements. In Experiment 2, opposite-polarity flankers did not reduce crowding, indicating an absence of selectivity for target-flanker similarity. Conclusions: Foveal crowding in albinism is markedly elevated, with a nystagmus-related spatial anisotropy and a lack of featural selectivity. These characteristics suggest that these elevations reflect both retinal image motion and a substantial sensory deficit arising from abnormal visual development.
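The QUEST procedure mentioned under Methods maintains a Bayesian posterior over the threshold and places each trial at the current best estimate. Below is a minimal sketch of that general algorithm (Watson & Pelli, 1983); the Weibull parameters are illustrative, not the study's settings.

```python
import numpy as np

class QuestStaircase:
    """Minimal QUEST-style Bayesian staircase for gap-size thresholds."""

    def __init__(self, grid=np.linspace(0.01, 2.0, 500),
                 beta=3.5, gamma=0.25, lapse=0.02):
        self.grid = grid                     # candidate thresholds (deg)
        self.log_post = np.zeros_like(grid)  # flat log-prior
        self.beta, self.gamma, self.lapse = beta, gamma, lapse

    def p_correct(self, x, threshold):
        # Weibull psychometric function; gamma = 0.25 for a 4AFC Landolt C
        return self.gamma + (1 - self.gamma - self.lapse) * \
            (1 - np.exp(-(x / threshold) ** self.beta))

    def next_size(self):
        post = np.exp(self.log_post - self.log_post.max())
        return float(np.sum(self.grid * post) / post.sum())  # posterior mean

    def update(self, x, correct):
        p = self.p_correct(x, self.grid)
        self.log_post += np.log(p if correct else 1 - p)

# q = QuestStaircase()
# each trial: x = q.next_size(); show gap of size x; q.update(x, was_correct)
```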
Rinaldi, F. G.; Piasini, E.
To make sense of a noisy world, living beings constantly face decisions between competing interpretations of ambiguous sensory data. This process parallels statistical model selection, where most frameworks, like the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), are based on a trade-off between a model's goodness-of-fit and its complexity. The same trade-off has been observed in humans. However, a core tenet of normative frameworks is that the trade-off should depend on the sample size (N): as more data become available, the goodness-of-fit grows faster than the complexity penalty, weakening the overall bias towards simplicity. It is unknown whether humans also conform to this scaling principle, and if so, whether it arises from an optimal computation or a simpler heuristic. Here, we investigate this question using a preregistered visual task in which subjects inferred the number of latent Gaussian sources generating clusters of data points, and in which the number of points (N) presented on each trial was varied systematically. We use three kinds of models to describe their behavior: a model with linear scaling of evidence in N (as in BIC and AIC), a model with no scaling, and a model with sub-linear scaling inspired by known biases in numerosity perception. Our results demonstrate that the normative, linear-scaling model provides the worst account of human behavior. Instead, we find strong evidence for sub-linear scaling with sample size. By inferring the shape of this scaling with Gaussian Processes, we reveal a distinct logarithmic trend for smaller N and a flattening at higher values, both consistent with numerosity-perception biases. This finding suggests that, when selecting among competing explanations for sensory data, humans do not perform a principled computation but rather employ a more efficient heuristic that repurposes lower-level perceptual mechanisms to dynamically weight evidence against model complexity.
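The linear-in-N scaling of evidence that the normative model embodies follows directly from the standard criteria: for i.i.d. data the maximized log-likelihood is a sum over observations and so grows linearly in N, while the complexity penalty is constant (AIC) or logarithmic (BIC):

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad
\mathrm{BIC} = k\ln N - 2\ln\hat{L}, \qquad
\ln\hat{L} = \sum_{i=1}^{N} \ln p(x_i \mid \hat{\theta})
```

Because the evidence term scales with N while the penalty does not (or only as ln N), the bias toward simple models weakens as more data arrive; the paper asks whether human observers show this weakening and finds a sub-linear version of it instead.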
Ajith, S.; Kaiser, D.; Yeh, L.-C.
Real-world visual search is often performed at the category level: we search for shoes or bags without knowing their exact features in advance. This requires categorical search templates that accommodate the inherent variability within the target category. Here, we examine how the variability of search templates across categories constrains visual search performance. We quantify template variability by measuring variability in object drawings from a large online dataset (Experiment 1) and from a controlled lab-based drawing task (Experiment 2), and in turn relate this variability to performance in categorical search. Across both experiments, higher category variability, and thus broader search templates, was associated with slower responses. Moreover, each observer's most prioritized object template predicted their search performance better than other observers' templates, indicating that individual differences in template variability shape visual search. Together, our findings demonstrate that naturalistic visual search is governed by structured variability across both object categories and observers.
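One way to operationalise the variability measure: embed each drawing in a feature space and take the mean pairwise distance within a category. A sketch of that logic; the embedding model, distance metric, and data below are placeholders rather than the paper's pipeline.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def template_variability(embeddings):
    """Category-level template variability: mean pairwise (correlation)
    distance between feature embeddings of drawings, one row per drawing."""
    return pdist(embeddings, metric="correlation").mean()

# hypothetical demo: per-category variability vs. mean categorical-search RT
rng = np.random.default_rng(0)
variabilities = [template_variability(rng.normal(size=(20, 64)))
                 for _ in range(12)]             # 12 object categories
mean_rts = rng.normal(size=12)                   # placeholder RTs per category
rho, p = spearmanr(variabilities, mean_rts)      # broader template -> slower?
```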
Engeser, M.; Babaei, N.; Kaiser, D.
Each individual looks at natural scenes in their own unique way, resulting in a distinct perceptual experience of the world. However, little is known about why such differences in gaze emerge. Here, we test the hypothesis that idiosyncrasies in gaze behavior are predicted by inter-subject variations in internal models--expectations about how scenes typically look. In two experiments, we first characterized participants' personal internal models by asking them to draw typical bathroom and kitchen scenes. Individual differences in these drawings were quantified using an objective deep learning pipeline and, in turn, related to individual differences in gaze behavior. In Experiment 1, where participants freely viewed a set of kitchen and bathroom photographs, inter-subject similarities in internal models did not predict inter-subject similarities in gaze. In Experiment 2, we encouraged strategic exploration through gaze-contingent viewing and a memory task. Here, inter-subject similarities in internal models predicted similarities in fixation frequency and in the sequence in which different object categories were inspected. These findings suggest that the influence of internal models on visual exploration is stronger under increased sensory uncertainty and when expectation-guided sampling of the environment is encouraged. Together, our results provide new insights into how individual expectations shape gaze behavior and help explain why people differ in how they explore the visual world.
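The inter-subject analysis can be pictured as comparing two subject-by-subject similarity matrices, one built from drawing features and one from gaze features. A Mantel-style sketch (without a permutation test); the exact statistics used in the study may differ.

```python
import numpy as np
from scipy.stats import spearmanr

def intersubject_rsm(features):
    """Subject-by-subject similarity matrix: correlation between each pair
    of subjects' feature vectors (drawing embeddings or gaze measures)."""
    return np.corrcoef(features)

def relate_models_to_gaze(drawing_features, gaze_features):
    """Correlate the off-diagonal entries of the two inter-subject
    similarity matrices; a simplified stand-in for the paper's analysis."""
    iu = np.triu_indices(drawing_features.shape[0], k=1)
    rsm_draw = intersubject_rsm(drawing_features)[iu]
    rsm_gaze = intersubject_rsm(gaze_features)[iu]
    return spearmanr(rsm_draw, rsm_gaze)
```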
Tian, K. J.; Motzer, J. A.; Denison, R. N.
When successive stimuli occur close enough together in time, their perception can be impaired. Such impairments indicate temporal competition between successive stimuli for representational resources. Voluntary temporal attention can bias processing resources in favor of a behaviorally relevant moment, improving perception at the attended time at the expense of impairments at unattended times. However, it is unclear whether these perceptual tradeoffs across time arise because voluntary temporal attention selects among actively competing stimulus representations, such as within visual working memory, or whether temporal attention instead facilitates stimulus processing prior to a competitive stage. Here we used a temporal cueing task with up to two targets in succession to test whether and how the effects of temporal attention depend on temporal competition. We found that voluntary temporal attention improved performance even in the absence of temporal competition, when only one stimulus appeared during the trial. Moreover, the magnitude of attentional enhancement was comparable with and without competition. These results suggest that voluntary temporal attention enhances perception by facilitating processing prior to a competitive stage, rather than by resolving conflicts between actively competing stimulus representations.
Rawal, A.; Wolff, M. J.; Rademaker, R. L.
Visual working memory allows for the brief maintenance of information to serve behavioral goals. It has been shown that when the specific action required to serve a future goal is predictable, people can flexibly change a visual memory representation to incorporate an action-based one, demonstrating the goal-oriented nature of visual working memory. Can such flexibility also be observed within the visual domain, between color and space? In this eye-tracking study, participants remembered either a centrally presented color or a spatial position around fixation. Critically, when remembering a color, the response wheel was either randomly rotated or shown at a fixed rotation on every trial. When fixed, every target color could be associated with a predictable position on the wheel at response. Do people incorporate this added spatial information into their behavior? Participants utilized color-space associations when remembering color: response initiation was faster when the color wheel was fixed compared to random, irrespective of whether an action could be planned or not. Next, we showed that gaze was biased towards the position of the spatial memory target during the delay, extending previous work on gaze biases. Importantly, also when remembering a color, gaze was biased towards the anticipated position of that color on the response wheel when it was fixed. Together, our results show a behavioral benefit of added spatial information for color memory, and systematic changes in gaze that reflect flexible utilization of space.
Lipsky, T.; Ehrenzeller, C.; Ansari, G.; Pfau, K.; Harmening, W.; Wu, Z.; Pfau, M.
Purpose: To quantify whether fundus tracking in microperimetry improves psychometric parameter estimation (an in vivo demonstration of improved stimulus-delivery precision), and to derive a psychometrically grounded criterion intensity for suprathreshold (defect-mapping) microperimetry. Methods: Twenty-five healthy volunteers underwent MAIA2 microperimetry at five loci: three outside and two inside the blind spot. Frequency-of-seeing (FoS) functions were measured in four blocks (2 tracking on; 2 tracking off). FoS data were fitted with cumulative-Gaussian psychometric functions to estimate sensitivity parameters. Mixed-effects models assessed tracking effects, and posterior simulations defined the optimal criterion intensity for separating 'seeing' from 'non-seeing' loci. Results: Tracking had little effect on threshold estimates at loci outside the blind spot, but lowered threshold estimates within the blind spot (posterior median difference, PMD [95% CrI], of -1.46 dB [-2.30, -0.62] at locus 4 and -1.02 dB [-1.94, -0.08] at locus 5). Tracking was associated with steeper psychometric slope parameters at loci 1-3 (PMD of -0.14 dB [-0.29, 0.01], -0.27 dB [-0.43, -0.12], and -0.22 dB [-0.40, -0.04]). Without tracking, false-positive responses were more frequent when fixation shifts displaced stimuli toward the 'seeing' retina. Simulation-based analysis identified 13 dB as the nominally optimal criterion for suprathreshold microperimetry (Youden index: 0.76 [0.74, 0.79]), comparable to 10 dB (0.74 [0.72, 0.76]). Conclusions: Even in healthy volunteers with stable fixation, fundus tracking measurably reduced sensitivity estimates at 'non-seeing' loci and sharpened FoS curves in the 'seeing' retina. A criterion intensity of 10 to 13 dB is a defensible choice for separating 'seeing' and 'non-seeing' retina in suprathreshold (defect-mapping) perimetry paradigms.
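A generic maximum-likelihood fit of a cumulative-Gaussian FoS function, as sketched below, recovers the threshold and slope parameters discussed in the Results; the study's mixed-effects formulation adds random effects across participants and loci on top of this.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_fos(intensities, n_seen, n_trials):
    """Fit a cumulative-Gaussian frequency-of-seeing function by maximum
    likelihood to per-intensity counts. Returns (threshold mu, slope sigma)."""
    def nll(params):
        mu, log_sigma = params
        p = norm.cdf(intensities, loc=mu, scale=np.exp(log_sigma))
        p = np.clip(p, 1e-6, 1 - 1e-6)  # guard against log(0)
        return -np.sum(n_seen * np.log(p) + (n_trials - n_seen) * np.log(1 - p))
    res = minimize(nll, x0=[np.median(intensities), 0.0], method="Nelder-Mead")
    mu, log_sigma = res.x
    return mu, np.exp(log_sigma)
```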
Koenderink, J.; van Doorn, A.; Braun, D. I.; Gegenfurtner, K. R.
A complete empirical characterization of color discrimination in three dimensions has long remained out of reach. Classical studies, beginning with MacAdam's ellipses, provided local measurements in restricted chromatic planes, but a spatially dense and internally consistent mapping of discrimination structure across full color space has not yet been achieved. Here we present such a systematic three-dimensional measurement of color discrimination in RGB space. Eight observers measured discrimination regions at 35 reference colors distributed on a body-centered cubic lattice within the RGB cube. At each location, color differences were probed along seven orientations, yielding 14 directional extents. These measurements defined centrally symmetric convex regions that were fitted with minimum-volume ellipsoids, providing a compact description of local discrimination structure. Ellipsoids were represented as symmetric positive-definite matrices and analyzed using a Frobenius geometry, enabling normalization across observers and smooth interpolation to arbitrary locations. The resulting metric field is spatially smooth, highly structured, and remarkably consistent across observers up to an individual global scale factor. Grain size increases along the achromatic axis and exhibits systematic chromatic asymmetries. Comparison with CIEDE2000 reveals substantial agreement in overall scale variation but systematic differences in local anisotropy. Together, these data provide a coherent three-dimensional empirical mapping of color discrimination across RGB space and establish an empirical framework for perceptual color metrics.
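To illustrate the pipeline from 14 directional extents to an ellipsoid represented as a symmetric positive-definite matrix: each probed direction u with extent r contributes one linear constraint on Q via r^2 u^T Q u = 1. The least-squares fit below is a simplification of the paper's minimum-volume ellipsoid fit, and the Frobenius interpolation is plain linear averaging of SPD matrices; the authors' observer normalisation is omitted.

```python
import numpy as np

def ellipsoid_from_extents(directions, extents):
    """Fit a discrimination ellipsoid {x : x^T Q x = 1} to directional
    extents by least squares on the 6 free entries of symmetric 3x3 Q."""
    A, b = [], []
    for u, r in zip(directions, extents):
        u = np.asarray(u, float) / np.linalg.norm(u)
        ux, uy, uz = u
        A.append([ux*ux, uy*uy, uz*uz, 2*ux*uy, 2*ux*uz, 2*uy*uz])
        b.append(1.0 / r**2)           # r^2 u^T Q u = 1  =>  u^T Q u = 1/r^2
    q = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)[0]
    return np.array([[q[0], q[3], q[4]],
                     [q[3], q[1], q[5]],
                     [q[4], q[5], q[2]]])

def frobenius_interpolate(Q1, Q2, t):
    """Interpolation between SPD matrices under the Frobenius (Euclidean)
    geometry, i.e., entrywise linear blending."""
    return (1 - t) * Q1 + t * Q2
```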
Ekinci, M. A.; Kaiser, D.
When individuals view the same visual input, they often differ in their judgments of aesthetic appeal, yet why they differ remains largely unclear. Here, we tested whether individual differences in aesthetic experience are linked to differences in visual exploration. In two experiments, participants watched the documentary "Home" while their eye movements were recorded. In Experiment 1, participants continuously rated aesthetic experience throughout the movie, whereas in Experiment 2, they watched the first half without a task and rated aesthetic experience only during the second half. Inter-individual similarity in gaze patterns, assessed using fixation heatmaps across time, predicted similarity in aesthetic appeal judgments in both experiments. Notably, in Experiment 2, gaze similarity during free viewing in the first half of the movie predicted similarity in aesthetic ratings during the second half, indicating that incidental eye movement patterns predict aesthetic experiences. Together, these results show that shared gaze patterns are linked to shared aesthetic experiences under naturalistic, dynamic viewing conditions.
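Gaze similarity of the kind used here can be computed from smoothed fixation heatmaps per subject and time window, then correlated across subjects. A minimal sketch; grid size, smoothing, and the correlation measure are illustrative choices, not the paper's.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_heatmap(fix_xy, shape=(72, 128), sigma=2.0):
    """Smoothed fixation-density map for one subject and time window;
    fixations are (x, y) pairs in [0, 1) normalised screen coordinates."""
    h = np.zeros(shape)
    for x, y in fix_xy:
        h[int(y * shape[0]), int(x * shape[1])] += 1
    return gaussian_filter(h, sigma)

def gaze_similarity(hm_a, hm_b):
    """Pearson correlation between two subjects' flattened heatmaps."""
    return np.corrcoef(hm_a.ravel(), hm_b.ravel())[0, 1]
```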
Quirmbach, F.; Helmert, J. R.; Pannasch, S.; Dix, A.; Limanowski, J.
For eye-hand coordination, predictions of the sensory consequences of movements may already be issued, and adjusted, during action preparation. In this pre-registered study, we combined a delayed-movement paradigm with a virtual-reality-based hand-eye tracking task to investigate the oculomotor correlates of planning and executing coordinated hand-eye movements under standard vs. nonstandard visual hand-movement feedback. We measured pupil dilation and gaze-hand tracking during action preparation and subsequent task execution, where visual movement feedback violated or matched cued expectations: participants prepared and, after a delay period, executed hand movements. Their movements were reflected by congruent or incongruent (inverted) movements of a glove-controlled virtual hand model, which they had to follow with their gaze. In the preceding delay period, visual cues could specify the to-be-executed movement (or leave it unspecified) and the visuomotor mapping (congruent or incongruent, 75% cue validity). We found that during the delay, pupil diameter increased more strongly when the movement was pre-cued (compared to left unspecified), and when nonstandard rather than standard visual movement feedback was expected. During execution, gaze-hand tracking performance decreased under nonstandard mappings, but significantly less so when the to-be-executed movement was pre-cued. Expectation-violation trials produced a strong pupil dilation, particularly when congruent (standard) visuomotor expectations were violated, but also when incongruent mappings were cued and congruent ones observed. Furthermore, expectation violation impaired tracking performance, again more strongly for pre-cued movements with a standard mapping. Our results indicate that oculomotor responses during the delay encode processes related to motor planning and flexible forward prediction of sensory action consequences ahead of execution, i.e., increased mental effort and expectations of sensory conflict. Moreover, the results demonstrate that the strength of these (updated) predictions affects eye-hand coordination and pupillary responses during subsequent execution of the planned action.
Altinordu, N.; Boynton, G. M.; Fine, I.
Color is a prominent feature of visual experience, yet humans can recognize objects easily and accurately from grayscale images. We examined whether color becomes more useful when spatial information is degraded by blurring. Participants viewed naturalistic scenes in color or grayscale, and reported whether a named target object was present, across a range of blur levels that simulated optical defocus from 0 to 8 diopters. With unblurred images, performance did not differ between color and grayscale conditions, but as blur increased, recognition accuracy declined. Color provided a modest but reliable advantage at higher levels of blur, suggesting that color becomes increasingly useful when optical quality is degraded. We hypothesize that the evolutionary shift towards trichromacy may have been partially driven by the need to compensate for optical degradation due to aging and/or accumulated light exposure.
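As a rough illustration of simulated defocus: geometric optics puts the blur-disc diameter at about 0.057 degrees per millimetre of pupil per diopter, which can be approximated with a Gaussian kernel. The study's actual blur simulation may well differ; the pupil size and the disc-to-Gaussian conversion below are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simulate_defocus(image, diopters, pupil_mm=3.0, px_per_deg=40.0):
    """Approximate optical defocus on an (H, W, 3) image with a Gaussian
    blur whose width follows the geometric blur-disc size."""
    blur_deg = 0.0573 * pupil_mm * diopters        # blur-disc diameter (deg)
    sigma_px = (blur_deg / 2.0) * px_per_deg / 2   # crude sd from disc radius
    if sigma_px <= 0:
        return image
    return gaussian_filter(image, sigma=(sigma_px, sigma_px, 0))  # per channel
```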
Gil Rodriguez, R.; Hedjar, L.; Kilic, B.; Gegenfurtner, K.
In our study, we used virtual reality to investigate how the colour of an object's surroundings influences colour constancy. Using Unreal Engine, we manipulated lighting and object properties in computer-generated scenes illuminated by five different light sources and presented them through an HTC Vive Pro Eye virtual reality headset. Participants assessed colour constancy by selecting the object that best matched a neutral reference from among five differently coloured options within the scene. Our results demonstrated a significant decline in colour constancy performance when the illuminant colour was in the opposite direction to that of the local surround, highlighting the interactive effects of surround colour and illumination.
Vanni, S.; Vedele, F.; Hokkanen, H.
The primate retina dissects visual scenes into multiple retinocortical streams. The most numerous retinal ganglion cell (GC) types, midget and parasol cells, are further divided into ON and OFF subtypes. These four GC populations have anatomical and physiological asymmetries, which are reflected in the spike trains received by downstream circuits. Computational models of the visual cortex, however, rarely take GC signal processing into account. We have built a macaque retina simulator with the aim of providing biologically plausible spike trains for downstream visual cortex simulations. The simulator is based on realistic sampling density and receptive field size as a function of eccentricity, as well as on two distinct spatial and three temporal receptive field models. Starting from literature data and earlier receptive field measurements, we synthesize distributions for receptive field parameters, from which the synthetic units are sampled. The models are restricted to monocular and monochromatic stimuli and follow data from the temporal hemiretina, which is more isotropic. We show that the model patches conform to anatomical data not used in the reconstruction process, and we characterize the responses with respect to spatial and temporal contrast sensitivity functions. The simulator starts from a stimulus video and provides biologically plausible spike trains for the distinct unit types. This supports the development of thalamocortical primate model systems of vision. In addition, it can provide a reference for more biophysical retina models. The independent parameters are housed in text files, supporting reparameterization for particular macaque data or other primate species. Author summary: The visual environment provides a rich source of information, and the structure and function of the visual system have been studied for decades in many species, including humans. The most complex data in mammalian species are processed in the cerebral cortex, but to date we are still missing a functioning model of cortical computations. While earlier anatomical and physiological data describe many details of the visual system, to understand the functional logic we need to numerically simulate the complex interactions within this system. To pave the way for simulating visual cortex computations, we have developed a functioning model of the macaque retina. The neuroinformatics work comprises a review and re-digitization of existing retina data from the literature, as well as statistics of earlier macaque receptive field data. Finally, we provide software that brings the collected neuroinformatics to life and allows researchers to convert visual input into biologically feasible spike trains for simulation experiments of the visual cortex.
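The eccentricity-dependent spatial receptive fields at the heart of such a simulator can be sketched as a difference of Gaussians whose centre size grows with eccentricity. The parameter values below are illustrative, not the simulator's fitted macaque values.

```python
import numpy as np

def dog_rf(grid_deg, ecc_deg, center_sd0=0.03, slope=0.02,
           surround_ratio=6.0, surround_gain=0.9):
    """Difference-of-Gaussians spatial receptive field whose centre sd
    grows linearly with eccentricity; grid_deg is an (xx, yy) mesh in
    degrees relative to the RF centre."""
    sd_c = center_sd0 + slope * ecc_deg   # centre sd (deg)
    sd_s = surround_ratio * sd_c          # surround sd (deg)
    r2 = grid_deg[0] ** 2 + grid_deg[1] ** 2
    center = np.exp(-r2 / (2 * sd_c ** 2)) / (2 * np.pi * sd_c ** 2)
    surround = np.exp(-r2 / (2 * sd_s ** 2)) / (2 * np.pi * sd_s ** 2)
    return center - surround_gain * surround

# xx, yy = np.meshgrid(np.linspace(-1, 1, 201), np.linspace(-1, 1, 201))
# rf = dog_rf((xx, yy), ecc_deg=5.0)  # ON-centre unit at 5 deg eccentricity
```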
Noerenberg, W.; Schweitzer, R.; Rolfs, M.
Saccadic eye movements sweep the visual scene across the retina, yet the resulting motion is rarely perceived. Visual factors alone, such as the presence of static pre- and post-saccadic images, can attenuate motion perception, suggesting a masking of the motion signal during early visual processing. Here, we isolated the visual component of this reduction in motion perception using simulated saccades presented to fixating observers. Across two experiments, we manipulated motion amplitude (6-18 dva), duration, and velocity profile and measured perceived amplitude and velocity at varying masking durations. Visual masking strongly reduced perceived motion amplitude and velocity, with short halftimes (~15 ms) that were largely invariant across saccade amplitudes. Critically, motion following a naturalistic saccadic velocity profile was perceived as smaller and slower than constant-velocity motion matched in amplitude and duration, even without explicit masking. This additional reduction increased with both amplitude and duration. These results show that visual mechanisms alone can account for substantial motion reduction across a large range of amplitudes and demonstrate a partially separable contribution of the saccadic velocity profile, suggesting that the temporal structure of retinal motion itself supports perceptual continuity across eye movements.
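For intuition about the constant-velocity vs. saccade-like comparison, the two profiles can be matched in amplitude and duration as below; a minimum-jerk shape stands in for the naturalistic saccadic velocity profile, which is an assumption rather than the authors' exact waveform.

```python
import numpy as np

def velocity_profiles(amplitude_dva, duration_s, n=200):
    """Constant-velocity vs. smooth saccade-like velocity profiles matched
    in amplitude and duration; both integrate to amplitude_dva."""
    t = np.linspace(0, duration_s, n)
    tau = t / duration_s
    v_const = np.full(n, amplitude_dva / duration_s)
    v_smooth = amplitude_dva / duration_s * 30 * tau**2 * (1 - tau)**2
    return t, v_const, v_smooth
```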
Maniquet, T.; Fang, H.; Ratan Murty, N. A.; Op de Beeck, H.
One of the distinctive features of the human visual system is the presence, in occipito-temporal cortex (OTC), of regions that show preferential activation to specific categories of visual objects. To understand how this selectivity relates to categorisation behaviour, studies have employed a distance-to-bound (DTB) approach, in which multivariate brain activity is used to estimate a decision boundary, from which behavioural performance can be predicted. Using this approach, correlations have been found between activity in OTC and behavioural performance on certain categorisation tasks. However, it remains unclear what determines where in OTC these correlations can be found, and with which categorisation tasks. Here, we bridged this gap by relating category-selective regions of OTC to behavioural performance while participants categorised images as belonging or not to their preferred categories. We adopted a more basic approach and considered simple, univariate activity rather than relying on decoding to build our DTB. Our results show that activation in regions selective for faces (FFA & OFA), bodies (EBA), and scenes (PPA) is sufficient to predict behavioural performance when categorising images as faces, bodies, or scenes, respectively. These results are largely consistent across reaction times and motor movements, and generalise to animacy classification. Overall, our data add to the evidence that category-selective regions in OTC can serve to guide categorisation behaviour, and underline the validity of the DTB approach for addressing this relationship.
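Schematically, the univariate distance-to-bound logic reduces to: place a bound between the mean activations for preferred and non-preferred images, and test whether each image's distance from that bound predicts reaction time. A sketch under those assumptions, not the authors' exact pipeline.

```python
import numpy as np
from scipy.stats import spearmanr

def dtb_correlation(activation, is_preferred, rt):
    """Univariate distance-to-bound: decision bound at the midpoint between
    mean activations for preferred vs. non-preferred images; DTB predicts
    that images farther from the bound yield faster reaction times."""
    bound = 0.5 * (activation[is_preferred].mean() +
                   activation[~is_preferred].mean())
    distance = np.abs(activation - bound)
    return spearmanr(distance, rt)  # expected negative under DTB
```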